123 research outputs found

Providing Diversity in K-Nearest Neighbor Query Results

Authors: Haritsa Jayant R.; Jain Anoop; Sarda Parag
Publication date: 15/10/2003

Given a point query Q in multi-dimensional space, K-Nearest Neighbor (KNN) queries return the K answers in the database that are closest to Q according to a given distance metric. In this scenario, it is possible that a majority of the answers are very similar to one another, especially when the data has clusters. For a variety of applications, such homogeneous result sets may not add value to the user. In this paper, we consider the problem of providing diversity in the results of KNN queries, that is, producing the closest result set such that each answer is sufficiently different from the rest. We first propose a user-tunable definition of diversity, and then present an algorithm, called MOTLEY, for producing a diverse result set as per this definition. Through a detailed experimental evaluation on real and synthetic data, we show that MOTLEY can produce diverse result sets by reading only a small fraction of the tuples in the database. Further, it imposes no additional overhead on the evaluation of traditional KNN queries, thereby providing a seamless interface between diversity and distance. Comment: 20 pages, 11 figures
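The abstract does not spell out MOTLEY or its diversity definition, so the following is only a minimal sketch of the general idea of a distance-ordered, diversity-filtered result set. It is not the authors' algorithm. The function name `greedy_diverse_knn`, the `min_div` threshold, and the use of Euclidean distance for both closeness and diversity are assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length coordinate tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def greedy_diverse_knn(points, query, k, min_div):
    """Greedy sketch (an assumption, not MOTLEY): scan points in order of
    distance to the query and keep a point only if it lies at least
    `min_div` away from every point already kept."""
    ranked = sorted(points, key=lambda p: euclidean(p, query))
    result = []
    for p in ranked:
        if all(euclidean(p, r) >= min_div for r in result):
            result.append(p)
            if len(result) == k:
                break
    return result


# Hypothetical data: three clustered points and two outliers.
points = [(1.0, 0.0), (1.1, 0.0), (1.2, 0.0), (0.0, 3.0), (5.0, 5.0)]
query = (0.0, 0.0)

# With min_div = 0 the filter never rejects, so this is plain KNN.
plain = greedy_diverse_knn(points, query, k=3, min_div=0.0)
# With min_div = 1 the near-duplicates in the cluster are skipped.
diverse = greedy_diverse_knn(points, query, k=3, min_div=1.0)
```

In this sketch the threshold plays the role of the user-tunable knob: raising `min_div` trades closeness to the query for spread among the answers.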

arXiv.org e-Print Archive

Open Access Repository of IISc Research Publications

Understanding Open Defecation in the Age of Swachh Bharat Abhiyan: Agency, Accountability, and Anger in Rural Bihar.

Authors: Jain Anoop; Ray Isha; Snell-Rood Claire; Wagner Ashley
Publication venue: eScholarship, University of California
Publication date: 01/02/2020

Swachh Bharat Abhiyan, India's flagship sanitation intervention, set out to end open defecation by October 2019. While the program improved toilet coverage nationally, large regional disparities in construction and use remain. Our study used ethnographic methods to explore perspectives on open defecation and latrine use, and the socio-economic and political reasons for these perspectives, in rural Bihar. We draw on insights from social epidemiology and political ecology to explore the structural determinants of latrine ownership and use. Though researchers have often pointed to rural residents' preference for open defecation, we found that people were aware of its many risks. We also found that (i) while sanitation research and "behavior change" campaigns often conflate the reluctance to adopt latrines with a preference for open defecation, this is an erroneous conflation; (ii) a subsidy can help (some) households to construct latrines but the amount of the subsidy and the manner of its disbursement are key to its usefulness; and (iii) widespread resentment towards what many rural residents view as a development bias against rural areas reinforces distrust towards the government overall and its Swachh Bharat Abhiyan-funded latrines in particular. These social-structural explanations for the slow uptake of sanitation in rural Bihar (and potentially elsewhere) deserve more attention in sanitation research and promotion efforts.

eScholarship - University of California

Face Cartoonisation For Various Poses Using StyleGAN

Authors: J Ankith Varun; Jain Kushal; Namboodiri Anoop
Publication date: 26/09/2023

This paper presents an innovative approach to achieve face cartoonisation while preserving the original identity and accommodating various poses. Unlike previous methods in this field that relied on conditional-GANs, which posed challenges related to dataset requirements and pose training, our approach leverages the expressive latent space of StyleGAN. We achieve this by introducing an encoder that captures both pose and identity information from images and generates a corresponding embedding within the StyleGAN latent space. By subsequently passing this embedding through a pre-trained generator, we obtain the desired cartoonised output. While many other approaches based on StyleGAN necessitate a dedicated and fine-tuned StyleGAN model, our method stands out by utilizing an already-trained StyleGAN designed to produce realistic facial images. We show by extensive experimentation how our encoder adapts the StyleGAN output to better preserve identity when the objective is cartoonisation.

arXiv.org e-Print Archive

Self-consistency for open-ended generations

Authors: Deoras Anoop; Jain Siddhartha; Ma Xiaofei; Xiang Bing
Publication date: 11/07/2023

In this paper, we present a novel approach for improving the quality and consistency of generated outputs from large-scale pre-trained language models (LLMs). Self-consistency has emerged as an effective approach for prompts with fixed answers, selecting the answer with the highest number of votes. We introduce a generalized framework for self-consistency that extends its applicability beyond problems that have fixed answers. Through extensive simulations, we demonstrate that our approach consistently recovers the optimal or near-optimal generation from a set of candidates. We also propose lightweight parameter-free similarity functions that show significant and consistent improvements across code generation, autoformalization, and summarization tasks, even without access to token log probabilities. Our method incurs minimal computational overhead, requiring no auxiliary reranker models or modifications to the existing model.
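One way to picture self-consistency without fixed answers is to pick the candidate that agrees most with the other candidates. The sketch below illustrates that idea only. The abstract does not give the paper's similarity functions, so the Jaccard token overlap used here, and the names `jaccard` and `select_most_consistent`, are assumptions for illustration.

```python
def jaccard(a, b):
    """Token-set overlap between two strings (an assumed stand-in for a
    lightweight, parameter-free similarity function)."""
    sa, sb = set(a.split()), set(b.split())
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def select_most_consistent(candidates, similarity=jaccard):
    """Return the candidate with the highest mean similarity to the others.
    Ties are broken in favour of the earliest candidate."""
    if not candidates:
        raise ValueError("need at least one candidate")
    best_index, best_score = 0, float("-inf")
    for i, cand in enumerate(candidates):
        others = [similarity(cand, o) for j, o in enumerate(candidates) if j != i]
        score = sum(others) / len(others) if others else 0.0
        if score > best_score:
            best_index, best_score = i, score
    return candidates[best_index]


# Hypothetical sampled generations for one prompt.
candidates = [
    "return sorted(items)",
    "return sorted(items)",
    "return list(reversed(items))",
    "return sorted(items, reverse=True)",
]
chosen = select_most_consistent(candidates)
```

When all candidates are either identical or completely dissimilar, the mean-similarity score is proportional to the number of exact matches, so this selection agrees with majority voting in the fixed-answer case. It needs only the generated text, not token log probabilities or a reranker model.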

arXiv.org e-Print Archive

Relation between Blood Lead Levels and Childhood Anemia in India

Authors: Garshick Eric; Guller Ulrich; Jain Nitin B.; Kazani Shamsah; Laden Francine; Shankar Anoop
Publication date: 02/08/2017

Lead pollution is a substantial problem in developing countries such as India. The US Centers for Disease Control and Prevention has defined an elevated blood lead level in children as ≥10 μg/dl, on the basis of neurologic toxicity. The US Environmental Protection Agency suggests a threshold lead level of 20-40 μg/dl for risk of childhood anemia, but there is little information relating lead levels <40 μg/dl to anemia. Therefore, the authors examined the association between lead levels as low as 10 μg/dl and anemia in Indian children under 3 years of age. Anemia was divided into categories of mild (hemoglobin level 10-10.9 g/dl), moderate (hemoglobin level 8-9.9 g/dl), and severe (hemoglobin level <8 g/dl). Lead levels <10 μg/dl were detected in 568 children (53%), whereas 413 (38%) had lead levels ≥10-19.9 μg/dl and 97 (9%) had levels ≥20 μg/dl. After adjustment for child's age, duration of breastfeeding, standard of living, parent's education, father's occupation, maternal anemia, and number of children in the immediate family, children with lead levels ≥10 μg/dl were 1.3 (95% confidence interval: 1.0, 1.7) times as likely to have moderate anemia as children with lead levels <10 μg/dl. Similarly, the odds ratio for severe anemia was 1.7 (95% confidence interval: 1.1, 2.6). Health agencies in India should note the association of elevated blood lead levels with anemia and make further efforts to curb lead pollution and childhood anemia.
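The anemia categories and the lead-level shares quoted above can be restated as a short sketch. The function name `classify_anemia` is hypothetical. Treating the one-decimal ranges as continuous cut-offs, and labelling hemoglobin of 11 g/dl or more as "none", are assumptions that the abstract does not state.

```python
def classify_anemia(hemoglobin_g_dl):
    """Map a hemoglobin level (g/dl) to the categories in the abstract.
    Cut-offs are read as continuous thresholds (an assumption):
    severe < 8, moderate 8 to < 10, mild 10 to < 11."""
    if hemoglobin_g_dl < 8.0:
        return "severe"
    if hemoglobin_g_dl < 10.0:
        return "moderate"
    if hemoglobin_g_dl < 11.0:
        return "mild"
    return "none"  # assumed label for levels at or above 11 g/dl


# Counts of children per blood lead band, as reported in the abstract.
lead_band_counts = {
    "<10 ug/dl": 568,
    "10-19.9 ug/dl": 413,
    ">=20 ug/dl": 97,
}
total_children = sum(lead_band_counts.values())
# Percentage share of each band, rounded to whole percent.
lead_band_shares = {
    band: round(100 * count / total_children)
    for band, count in lead_band_counts.items()
}
```

The three counts sum to 1,078 children, and the rounded shares match the 53%, 38%, and 9% reported in the abstract.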

RERO DOC Digital Library